Introduce LayerNorm optimization from latest Apex #277
Conversation
@Quentin-Anthony, thanks for this PR. But we do need backwards-compatibility, so please add a version check.
Apex doesn't have versioning yet, so I added support to manually inspect the function signature instead. Hopefully in the future NVIDIA/apex#1648 gets merged and we can just check the Apex version.
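Since Apex exposes no version attribute, the signature-inspection approach can be sketched as below. The function name `fused_layer_norm_affine` is a hypothetical stand-in for the Apex callable being probed; the real check would inspect Apex's fused layer-norm function instead.

```python
import inspect


# Hypothetical stand-in for Apex's fused layer-norm entry point;
# newer Apex builds added the memory_efficient keyword argument.
def fused_layer_norm_affine(inp, weight, bias, normalized_shape,
                            eps=1e-5, memory_efficient=False):
    pass


def has_memory_efficient(fn):
    # Apex has no __version__ to compare against, so look for the
    # new keyword argument in the callable's signature directly.
    return "memory_efficient" in inspect.signature(fn).parameters


HAVE_PERSIST_LAYER_NORM = has_memory_efficient(fused_layer_norm_affine)
```

The same check degrades gracefully on older Apex: a signature without the argument simply yields `False`, and the caller falls back to the old code path.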
Works for me. Thanks!
Hi, author of NVIDIA/apex#1715 here. Thanks for incorporating this into the repo (as the default)! This is very exciting. Moreover, I'm writing to let you know that https://github.com/Quentin-Anthony/Megatron-DeepSpeed-MS/blob/046319fecccfb8053ad3de5181e48f943ff14d27/megatron/model/fused_layer_norm.py#L96C18-L96C75 also has the same memory_efficient feature in the same PR!
@RuiWang1998, thanks for the information. @Quentin-Anthony, do you have bandwidth to handle this?
Yep, I'll take care of it.
Introduced in NVIDIA/apex#1715
My PR lets the user disable this LayerNorm optimization, but I suspect everyone will use it, so it's on by default.
This is not backwards-compatible with older Apex. Do you want a version check, or is this OK as-is?
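One way to reconcile the on-by-default switch with older Apex is to build the forward kwargs conditionally, passing the new argument only when the installed Apex accepts it. This is a minimal sketch under that assumption; the names are illustrative, not Megatron-DeepSpeed's actual ones.

```python
# Hypothetical sketch: gate the new Apex argument behind both a
# capability check (does this Apex build accept it?) and a user-facing
# switch (did the user leave the optimization enabled?).
def layer_norm_kwargs(apex_has_mem_eff, user_enabled=True):
    kwargs = {}
    if apex_has_mem_eff and user_enabled:
        # Only pass memory_efficient when this Apex build accepts it;
        # older Apex would raise TypeError on an unknown keyword.
        kwargs["memory_efficient"] = True
    return kwargs
```

Older Apex never sees the unknown keyword, so no version pin is required, while new Apex gets the optimization unless the user opts out.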